Boltzmann-Shannon Entropy: Generalization and Application



C. G. Chakrabarti^1 and I. Chakrabarty^{2,3}

1 Department of Applied Mathematics, Calcutta University, Kolkata 700009, India
2 Heritage Institute of Technology, Kolkata 700107, India
3 Bengal Engineering and Science University, Howrah, W.B., India
(Dated: February 9, 2008) 

The paper deals with the generalization of both the Boltzmann entropy and the
Boltzmann distribution in the light of the most-probable interpretation of
statistical equilibrium. The statistical analysis of the generalized entropy
and distribution leads to some new results of physical importance.

INTRODUCTION

Boltzmann was the first to provide a statistical definition of entropy,
linking the concept of entropy with molecular disorder or chaos [1].
Boltzmann entropy is the key to the foundation of statistical mechanics and
is, in fact, the basis of all statistical concepts of entropy. The concept of
probability, which is vital for a statistical theory, has however not emerged
clearly from Boltzmann entropy: the thermodynamic probability or statistical
weight appearing in Boltzmann entropy is not a probability, it is an integer.
Statistical equilibrium, defined by Boltzmann and Planck as the most probable
state achieved by maximizing the thermodynamic probability, therefore carries
a certain opaqueness [2]. The object of the present paper is to modify
Boltzmann entropy in order to introduce the notion of a probability
distribution in Boltzmann statistics. To this end we first consider a
classical system and reduce the Boltzmann entropy to the form of Shannon
entropy [3], not in terms of probabilities but in terms of the occupation
numbers of the different energy states of the system. This form of entropy,
which we call Boltzmann-Shannon entropy, is then modified in the light of the
most-probable interpretation of statistical equilibrium. The modified entropy,
called Boltzmann-Shannon cross-entropy, leads to two important results. The
first is the probability distribution of the macrostates, consistent with
Einstein's inversion of the Boltzmann principle [3]. The second is the
equivalence of information and negentropy, consistent with Brillouin's
negentropy principle of information [4]. The most-probable interpretation of
statistical equilibrium also leads to a generalized form of the Boltzmann
distribution involving prior probabilities. The appearance of prior
probabilities makes the results of interest for both physical and
non-physical systems.



BOLTZMANN-SHANNON ENTROPY AND 
PROBABILITY 

The Boltzmann entropy of a system is defined by

$$ S = k \ln W \tag{1} $$

where $k$ is the Boltzmann constant and $W$, called the thermodynamic
probability or statistical weight, is the total number of microscopic states
or complexions compatible with the macroscopic state of the system. The
thermodynamic probability $W$ appearing in the Boltzmann entropy (1) is not a
probability, it is an integer. We may, however, ask for the probability
$P(A_n)$ of any macroscopic state $A_n$ (say). This probability may be
represented as the ratio of the statistical weight $W_n$ to the sum of the
statistical weights of all the macroscopic states that are compatible with
the given constraints:



$$ P(A_n) = \frac{W_n}{\sum_n W_n} \tag{2} $$

Since for large $W_n$ [2]

$$ (W_n)_{\max} \approx W_{\mathrm{total}} = \sum_n W_n $$

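the probability (2) may be replaced by

$$ P(A_n) = \frac{W_n}{(W_n)_{\max}} \tag{3} $$

or, by the Boltzmann entropy (1),

$$ P(A_n) = \exp\left[ \frac{S - S_{\max}}{k} \right] \tag{4} $$

where $S_{\max}$ is the maximum value of the entropy $S$. It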
was this form of the Boltzmann principle that was used successfully by
Einstein [5] in his study of thermodynamic fluctuations and its various
applications. In this way Einstein introduced the probability distribution by
inverting the Boltzmann principle. Note that in this approach entropy comes
first and probability comes later.
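
The step from the ratio of weights to the exponential form can be checked numerically. The sketch below assumes a hypothetical two-state system (not taken from the paper), for which the sum of all weights is exactly 2^N; the function names are illustrative only. It compares ln P computed from the full sum of weights with (S - S_max)/k.

```python
import math

def log_weight(N, N1):
    """ln W for a two-state macrostate [N1, N - N1], W = N! / (N1! (N - N1)!)."""
    return math.lgamma(N + 1) - math.lgamma(N1 + 1) - math.lgamma(N - N1 + 1)

def log_prob_exact(N, N1):
    """ln P(A_n) from the ratio of W_n to the sum of all weights (2**N here)."""
    return log_weight(N, N1) - N * math.log(2.0)

def log_prob_einstein(N, N1):
    """ln P(A_n) = (S - S_max)/k with S = k ln W; the maximum sits at N1 = N/2."""
    return log_weight(N, N1) - log_weight(N, N // 2)

N = 10_000
for N1 in (5_000, 4_900, 4_500):
    exact = log_prob_exact(N, N1)
    approx = log_prob_einstein(N, N1)
    # The two differ by a constant offset that is tiny compared with N.
    print(N1, round(exact, 3), round(approx, 3), round(approx - exact, 3))
```

In this example the two expressions differ only by the constant ln(2^N / W_max), which grows like (1/2) ln(πN/2) and is therefore negligible next to quantities of order N. This is the sense in which the largest weight can stand in for the total.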
In the present paper we shall follow a different path. We shall first
introduce the probability distribution of macrostates and then find the
expression of entropy consistent with the general mathematical theory of
entropy [6]. Before we do that, we consider a classical system and reduce the
Boltzmann entropy to the form of Shannon entropy, not in terms of
probabilities but in terms of the occupation numbers of the different energy
states of the system. Let the system under consideration consist of $N$
molecules classified into $n$ energy states $E_i$ $(i = 1, 2, \dots, n)$ with
corresponding occupation numbers $N_i$ $(i = 1, 2, \dots, n)$. The system is
assumed to be an isolated system characterized by fixed values of the total
energy and the number of molecules:

$$ \sum_{i=1}^{n} N_i = N \ (\text{fixed}), \qquad \sum_{i=1}^{n} N_i E_i = E \ (\text{fixed}) \tag{5} $$



The macroscopic state of the system is given by the set of occupation numbers
$A_n = [N_1, N_2, \dots, N_n]$. Thus the statistical weight of the
macroscopic state $A_n = [N_1, N_2, \dots, N_n]$ is given by

$$ W(N_1, N_2, \dots, N_n) = \frac{N!}{\prod_{i=1}^{n} N_i!} \tag{6} $$

representing the total number of microscopic states of the system. For large
$N_i$ $(i = 1, 2, \dots, n)$, using Stirling's approximation, the Boltzmann
entropy $S$ with statistical weight (6) is reduced to the form

$$ S = k \ln \frac{N!}{\prod_{i=1}^{n} N_i!} \approx -kN \sum_{i=1}^{n} p_i \ln p_i \tag{7} $$



where $p_i = N_i/N$ is the relative frequency and, for large $N$, the
probability that a molecule lies in the $i$th energy state $E_i$. The
expression

$$ H(p_1, p_2, \dots, p_n) = -k \sum_{i=1}^{n} p_i \ln p_i \tag{8} $$



appearing on the right-hand side of (7) is the Shannon entropy measuring the
uncertainty associated with the probability distribution
$(p_1, p_2, \dots, p_n)$ [3]. Thus for a large classical system the Boltzmann
entropy is proportional to the Shannon entropy, and as such the Shannon
entropy defined by (8) is also a measure of the molecular disorder of the
system. In terms of occupation numbers the expression (7) can be written as

$$ S = -k \sum_{i=1}^{n} N_i \ln N_i + kN \ln N \tag{9} $$



The second term on the right-hand side of (9) is a constant for a fixed
number of molecules constituting the system. For variational purposes this
constant may be dropped, and we can write the Boltzmann entropy of a
classical system in the form

$$ S = -k \sum_{i=1}^{n} N_i \ln N_i \tag{10} $$



Note that (10) has the same functional form as the Shannon entropy (8) but is
defined over the non-probabilistic distribution $[N_1, N_2, \dots, N_n]$.
Because of this similarity with the Shannon entropy (8) we call it the
Boltzmann-Shannon entropy. In the next section we generalize the
Boltzmann-Shannon entropy (10) and examine its physical or thermodynamic
significance.
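
The reduction from the exact weight (6) to the Shannon form (7), and the rewriting (9) in terms of occupation numbers, can be checked numerically. The following is a minimal sketch with k = 1 and hypothetical occupation numbers in the ratio 5:3:2; the function names are illustrative and not from the paper.

```python
import math

def boltzmann_entropy(occupations, k=1.0):
    """Exact S = k ln W with W = N! / prod(N_i!), evaluated via log-gamma."""
    N = sum(occupations)
    return k * (math.lgamma(N + 1) - sum(math.lgamma(Ni + 1) for Ni in occupations))

def shannon_form(occupations, k=1.0):
    """Stirling limit S = -k N sum(p_i ln p_i) with p_i = N_i / N."""
    N = sum(occupations)
    return -k * N * sum((Ni / N) * math.log(Ni / N) for Ni in occupations if Ni > 0)

def occupation_form(occupations, k=1.0):
    """The same quantity written as -k sum(N_i ln N_i) + k N ln N."""
    N = sum(occupations)
    return -k * sum(Ni * math.log(Ni) for Ni in occupations if Ni > 0) + k * N * math.log(N)

for scale in (10, 1_000, 100_000):
    occ = [5 * scale, 3 * scale, 2 * scale]
    exact = boltzmann_entropy(occ)
    approx = shannon_form(occ)
    # The ratio tends to 1 as the occupation numbers grow.
    print(scale, exact / approx, occupation_form(occ) / approx)
```

The ratio of the exact entropy to the Shannon form approaches 1 as the occupation numbers grow, while the occupation-number form agrees with the Shannon form at every size, since the two are algebraically identical.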



PROBABILITY-DISTRIBUTION OF 
MACROSTATES: BOLTZMANN-SHANNON 
CROSS-ENTROPY 

According to Boltzmann and Planck, thermodynamic equilibrium is defined as
the most-probable state. The thermodynamic probability
$W(N_1, N_2, \dots, N_n)$ of the macrostate $A_n = [N_1, N_2, \dots, N_n]$ is
not a probability, it is an integer. So the thermodynamic equilibrium
obtained by maximizing the thermodynamic probability
$W(N_1, N_2, \dots, N_n)$, or equivalently the Boltzmann entropy (1), may
lead to some confusion [2]. To find the most-probable state we first have to
determine the probability distribution of the macrostate
$A_n = [N_1, N_2, \dots, N_n]$ on the basis of the prior information or data.
The occupation numbers $[N_1, N_2, \dots, N_n]$ are assumed to be a set of
random variables in view of the many-body aspect of the system. Let
$P(A_n) = P(N_1, N_2, \dots, N_n)$ be the probability distribution of
$[N_1, N_2, \dots, N_n]$. Let the means or averages of the occupation numbers
$[N_1, N_2, \dots, N_n]$ be known:

$$ \sum_{R_{N,n}} N_i \, P(N_1, N_2, \dots, N_n) = \bar{N}_i \tag{11} $$



where $i = 1, 2, \dots, n$ and $R_{N,n}$ is the set of non-negative integers
satisfying the condition

$$ N_1 + N_2 + \dots + N_n = N \tag{12} $$



The mean values $\bar{N}_i$ $(i = 1, 2, \dots, n)$ given by (11) constitute
the constraints on the system. Note that the thermodynamic probability or
statistical weight $W(N_1, N_2, \dots, N_n)$ given by (6) is the prior
information about the macrostate $A_n = [N_1, N_2, \dots, N_n]$ of the
system. An appropriate measure of uncertainty or entropy about the system is
given by the Bayesian entropy [7].

Our problem is to estimate the probability distribution
$P(N_1, N_2, \dots, N_n)$ under the prior information (6) and the constraints
(11). We can do this by a generalization of Jaynes' maximum-entropy principle
[8]. According to Jaynes [8], the best estimate of the probability
distribution $P(N_1, N_2, \dots, N_n)$ corresponds to the maximization of the
Bayesian entropy (13) subject to the constraints (11) and the normalization
condition:

$$ \sum_{R_{N,n}} P(N_1, N_2, \dots, N_n) = 1 \tag{14} $$



The best estimate of the probability $P(N_1, N_2, \dots, N_n)$ is then given
by the multinomial distribution [7, 9]

$$ P(N_1, N_2, \dots, N_n) = \frac{N!}{\prod_{i=1}^{n} N_i!} \prod_{i=1}^{n} (p_i^0)^{N_i} \tag{15} $$

where $p_i^0$ $(i = 1, 2, \dots, n)$ is the prior probability that a molecule
lies in the $i$th energy state; it is determined from the prior information
or constraint (11). In the existing literature the multinomial distribution
of the macrostate $A_n = [N_1, N_2, \dots, N_n]$ is usually assumed without
any physical justification. We have, however, provided an
information-theoretic method, based on the generalized maximum-entropy
principle, which takes account of the available prior constraints and
information about the system.
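
A sketch of how the multinomial form can arise from the maximization is given below. It assumes that the Bayesian entropy (13) has the usual prior-weighted form with the statistical weight (6) as prior; that form, the multipliers $\lambda_0$, $\lambda_i$ and the shorthand $x_i$ are assumptions introduced here for illustration and are not taken from the text.

```latex
% Assumed form of the Bayesian entropy with prior weight W (an assumption):
S_B = -\sum_{R_{N,n}} P \ln\frac{P}{W},
\qquad W = \frac{N!}{\prod_{i=1}^{n} N_i!}

% Stationarity with multipliers \lambda_0 (normalization (14)) and \lambda_i (means (11)):
\frac{\partial}{\partial P}\Big[ S_B - \lambda_0 \sum_{R_{N,n}} P
   - \sum_{i=1}^{n} \lambda_i \sum_{R_{N,n}} N_i P \Big] = 0
\;\Longrightarrow\;
P = W \, e^{-1-\lambda_0} \prod_{i=1}^{n} x_i^{N_i},
\qquad x_i = e^{-\lambda_i}

% Normalization by the multinomial theorem,
% \sum_{R_{N,n}} W \prod_i x_i^{N_i} = \big(\sum_j x_j\big)^N :
P(N_1,\dots,N_n) = \frac{N!}{\prod_{i=1}^{n} N_i!} \prod_{i=1}^{n} (p_i^0)^{N_i},
\qquad p_i^0 = \frac{x_i}{\sum_{j=1}^{n} x_j}

% The mean constraints (11) then fix the parameters:
\bar{N}_i = N p_i^0
```

Under this assumption the multipliers enter only through the normalized ratios $p_i^0$, which is why the constraints (11) determine the prior probabilities.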

Assuming $N_i$ $(i = 1, 2, \dots, n)$ to be very large and using Stirling's
approximation, we can reduce the logarithm of the probability
$P(N_1, N_2, \dots, N_n)$ to the form

$$ k \ln P(N_1, N_2, \dots, N_n) \approx -Nk \sum_{i=1}^{n} p_i \ln \frac{p_i}{p_i^0} \tag{16} $$



where $p_i = N_i/N$ $(i = 1, 2, \dots, n)$ and, for large $N$, it is the
probability that a molecule lies in the $i$th energy state $E_i$. Note that
(16) is a generalization of (3) and is, in fact, a measure of relative
entropy. Apart from the multiplicative constant $(-Nk)$, the expression (16)
is known as the Kullback-Leibler relative information, or simply the Kullback
cross-entropy, giving a measure of directed divergence between the
probability distributions $[p_1, p_2, \dots, p_n]$ and
$[p_1^0, p_2^0, \dots, p_n^0]$ [10].
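
The quality of the approximation in (16) can be examined numerically. The sketch below sets k = 1 and uses hypothetical priors and occupation numbers; it compares the exact logarithm of the multinomial probability (15) with -N times the Kullback-Leibler divergence. The function names are illustrative.

```python
import math

def log_multinomial(occupations, priors):
    """Exact ln P for the multinomial distribution (15)."""
    N = sum(occupations)
    log_w = math.lgamma(N + 1) - sum(math.lgamma(Ni + 1) for Ni in occupations)
    return log_w + sum(Ni * math.log(p0) for Ni, p0 in zip(occupations, priors))

def minus_n_kl(occupations, priors):
    """-N sum(p_i ln(p_i / p0_i)) with p_i = N_i / N: right side of (16) for k = 1."""
    N = sum(occupations)
    return -N * sum((Ni / N) * math.log((Ni / N) / p0)
                    for Ni, p0 in zip(occupations, priors) if Ni > 0)

priors = [0.5, 0.3, 0.2]          # hypothetical prior probabilities
for scale in (10, 1_000, 100_000):
    occ = [4 * scale, 4 * scale, 2 * scale]
    # Exact log-probability next to its Stirling (cross-entropy) limit.
    print(scale, log_multinomial(occ, priors), minus_n_kl(occ, priors))
```

Both quantities are negative, and the gap between them grows only logarithmically with the occupation numbers, so their ratio tends to 1 for large N.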
Let us now transform (16) in terms of the occupation numbers
$[N_1, N_2, \dots, N_n]$. The prior probabilities $p_i^0$ are the parameters
of the multinomial distribution (15). As we have stated, the $p_i^0$ are to
be determined in terms of the available constraints (11), that is, in terms
of $\bar{N}_i$ $(i = 1, 2, \dots, n)$. An unbiased estimate of $p_i^0$ is
given by $p_i^0 = \bar{N}_i/N$. Then, replacing $p_i$ by $N_i/N$ and $p_i^0$
by $\bar{N}_i/N$, we have

$$ k \ln P(N_1, N_2, \dots, N_n) = -k \sum_{i=1}^{n} N_i \ln \frac{N_i}{\bar{N}_i} \tag{17} $$



which is a generalization of the Boltzmann-Shannon entropy (10). It is, in
fact, a measure of relative entropy defined over the non-negative integers.
Since $P(N_1, N_2, \dots, N_n) < 1$, the quantity (17) is, however, negative.
We shall call the negative of (17), that is, the quantity

$$ -k \ln P(N_1, N_2, \dots, N_n) = k \sum_{i=1}^{n} N_i \ln \frac{N_i}{\bar{N}_i} \tag{18} $$

the generalized Boltzmann entropy. The left-hand side of (18) is the
probabilistic entropy of the macrostate $A_n = [N_1, N_2, \dots, N_n]$. The
right-hand side has the same form as the Kullback cross-entropy [10], and
because of this similarity we shall call this expression the
Boltzmann-Shannon cross-entropy, defined over the set of non-negative
integers $[N_1, N_2, \dots, N_n]$. In the following we study its physical or
thermodynamic significance.

Let us assume that the averages $\bar{N}_i$ $(i = 1, 2, \dots, n)$ correspond
to the thermodynamic equilibrium values of $N_i$ $(i = 1, 2, \dots, n)$; then
it is easy to show that

$$ -k \sum_{i=1}^{n} N_i \ln \frac{N_i}{\bar{N}_i} = S - S_{\mathrm{equil}} \tag{19} $$



where $S$ is the entropy of the system in the non-equilibrium state
$[N_1, N_2, \dots, N_n]$ and $S_{\mathrm{equil}}$ is that of the equilibrium
state $[\bar{N}_1, \bar{N}_2, \dots, \bar{N}_n]$. From (18) and (19) we have

$$ -k \ln P(N_1, N_2, \dots, N_n) = S_{\mathrm{equil}} - S \tag{20} $$



The left-hand side of (20), which we have stated to represent the
probabilistic entropy of the macrostate $[N_1, N_2, \dots, N_n]$, is also the
measure of information obtained about the macrostate
$[N_1, N_2, \dots, N_n]$ after its realization [6]. The right-hand side of
(20) is the negentropy of the system in the state $[N_1, N_2, \dots, N_n]$.
The relation (20) thus implies the equivalence of information and negentropy
and is consistent with Brillouin's negentropy principle of information [4].
The relation (20) also provides another important result. From (20) we can
write the probability of the macrostate $[N_1, N_2, \dots, N_n]$ as

$$ P[N_1, N_2, \dots, N_n] = \exp\left[ \frac{S - S_{\mathrm{equil}}}{k} \right] \tag{21} $$



consistent with Einstein's result (4) obtained by inverting 
Boltzmann principle. 
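
The identity (19) relies on the displaced state sharing the total number and total energy of the equilibrium state. A minimal numerical sketch, assuming a hypothetical three-level system with equally spaced energies, k = 1, and occupation numbers treated as continuous, is the following; all names and numerical values are illustrative.

```python
import math

def bs_entropy(occupations, k=1.0):
    """Boltzmann-Shannon entropy (10): S = -k sum(N_i ln N_i)."""
    return -k * sum(Ni * math.log(Ni) for Ni in occupations)

def cross_entropy(occupations, means, k=1.0):
    """Boltzmann-Shannon cross-entropy (18): k sum(N_i ln(N_i / Nbar_i))."""
    return k * sum(Ni * math.log(Ni / Nb) for Ni, Nb in zip(occupations, means))

# Hypothetical three-level system, arbitrary energy units.
energies = [0.0, 1.0, 2.0]
beta, N = 0.7, 9_000.0
Z = sum(math.exp(-beta * E) for E in energies)
equilibrium = [N * math.exp(-beta * E) / Z for E in energies]

# A shift of (+d, -2d, +d) leaves both N and the total energy unchanged.
d = 150.0
state = [equilibrium[0] + d, equilibrium[1] - 2 * d, equilibrium[2] + d]

lhs = cross_entropy(state, equilibrium)            # -k ln P, from (18)
rhs = bs_entropy(equilibrium) - bs_entropy(state)  # S_equil - S, from (20)
print(lhs, rhs, math.exp(-lhs))                    # last value: P from (21)
```

The two sides agree to rounding error, and the probability obtained from (21) falls off rapidly as the displacement d grows. Repeating the check with a shift that changes the total energy breaks the equality, which shows where the constraints (5) enter.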



GENERALIZED BOLTZMANN DISTRIBUTION 
AND APPLICATIONS 

In Boltzmann statistics the distribution law of thermodynamic equilibrium is
determined by maximizing the thermodynamic probability
$W(N_1, N_2, \dots, N_n)$ given by (6), or equivalently the Boltzmann-Shannon
entropy (10), subject to the constraints (5). The maximization yields the
Boltzmann distribution

$$ p_i = \frac{e^{-\beta E_i}}{Z(\beta)} \tag{22} $$

where

$$ Z(\beta) = \sum_{i=1}^{n} e^{-\beta E_i} \tag{23} $$



and the parameter $\beta$ may be identified with the inverse temperature by
the relation $\beta = 1/(kT)$, $T$ being the absolute temperature of the
system. According to the most-probable interpretation of statistical
equilibrium, the maximum of the probability $P(N_1, N_2, \dots, N_n)$, or
equivalently of $\ln P(N_1, N_2, \dots, N_n)$ given by (16), subject to the
constraints corresponds to statistical equilibrium. We then have the
generalized Boltzmann distribution

$$ p_i = p_i^0 \, \frac{e^{-\beta E_i}}{Z(\beta)} \tag{24} $$

where

$$ Z(\beta) = \sum_{i=1}^{n} p_i^0 \, e^{-\beta E_i} \tag{25} $$
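
The effect of the prior factor in (24) and (25) can be illustrated with a small sketch. The energy levels, the value of β and the two sets of priors below are hypothetical, and the function name is illustrative.

```python
import math

def generalized_boltzmann(energies, priors, beta):
    """p_i = p0_i exp(-beta E_i) / Z(beta), with Z(beta) = sum_i p0_i exp(-beta E_i)."""
    weights = [p0 * math.exp(-beta * E) for p0, E in zip(priors, energies)]
    Z = sum(weights)
    return [w / Z for w in weights]

energies = [0.0, 1.0, 2.0, 3.0]   # hypothetical levels, arbitrary units
beta = 1.0

uniform = generalized_boltzmann(energies, [0.25] * 4, beta)
tilted = generalized_boltzmann(energies, [0.1, 0.2, 0.3, 0.4], beta)
print(uniform)   # uniform priors
print(tilted)    # priors favouring the higher levels
```

With uniform priors the common factor cancels between the numerator and Z(β), so the probabilities coincide with those of the ordinary Boltzmann distribution. Priors that favour the higher levels shift weight towards them at the same β.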



The difference from the usual Boltzmann distribution (22) comes from the
multiplicative factor $p_i^0$, the prior probability. The appearance of prior
probabilities makes the problem complex, and it is difficult to determine the
prior probabilities in statistical mechanics or in any other branch of
science [12, 13]. If no state is preferable to any other, it is reasonable to
assume that all the prior probabilities are equal, so that $p_i^0 = 1/n$
$(i = 1, 2, \dots, n)$. This is Laplace's principle of insufficient knowledge
[13, 14]. According to Jaynes [8] this is the state of maximum prior
ignorance. In the case of equal prior probabilities the most-probable
distribution reduces to the form

$$ p_i = \frac{1}{n} \, \frac{e^{-\beta E_i}}{Z(\beta)} \tag{26} $$



which, except for the multiplicative constant $1/n$, is the Boltzmann
distribution derived earlier. For the most-probable state of thermodynamic
equilibrium with equal prior probabilities $1/n$, the entropy (7) becomes

$$ S_{\mathrm{equil}} = -kN \sum_{i=1}^{n} p_i \ln p_i = kN \left[ \ln n + \left( \beta E + \ln Z(\beta) \right) \right] \tag{27} $$

On the other hand, with unequal prior probabilities, the entropy of
thermodynamic equilibrium corresponding to the probability distribution (24)
is given by

$$ S'_{\mathrm{equil}} = -Nk \sum_{i=1}^{n} p_i^0 \ln p_i^0 + Nk \left( \beta E + \ln Z(\beta) \right) \tag{28} $$

Since $\ln n > -\sum_{i=1}^{n} p_i^0 \ln p_i^0$, we have

$$ S_{\mathrm{equil}} > S'_{\mathrm{equil}} \tag{29} $$

implying that the thermodynamic equilibrium with unequal prior probabilities
$p_i^0$ does not correspond to the maximum entropy or maximum disorder of the
system. This is a violation of the existing physical law and is due to the
unequal prior probabilities [15].

We now consider a physical problem where unequal prior probabilities appear
and make things different from those with equal prior probabilities. Let us
consider a collection of $N$ linear harmonic oscillators, all with frequency
$\nu$. The energy levels of a linear harmonic oscillator are given by

$$ E_i = \left( i - \tfrac{1}{2} \right) h\nu \tag{30} $$

where $i = 1, 2, \dots$ and $h$ is Planck's constant. When the prior
probabilities are all equal, the total energy of the system is given by

$$ E = \frac{h\nu}{2} + \frac{h\nu}{e^{\beta h\nu} - 1} \tag{31} $$

Now if the collection is of two-dimensional harmonic oscillators, the
situation becomes different. In this case the prior probabilities $p_i^0$
increase linearly with $i$, so that we can write [16]

$$ p_i^0 = \frac{i}{C} \tag{32} $$

where $i = 1, 2, \dots$ and $C$ is a constant necessary to make $p_i^0$ a
probability in the true sense. In the case of two-dimensional oscillators
[16]

$$ E_i = i h\nu \tag{33} $$

where $i = 1, 2, \dots$, and a bit of calculation gives the total energy of
the system as [16]

$$ E = h\nu + \frac{2h\nu}{e^{\beta h\nu} - 1} \tag{34} $$

where the zero-energy level has been changed from that of linear harmonic
oscillators.
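
The closed forms (31) and (34) can be checked against a direct sum over levels. The sketch below reads E as the mean energy per oscillator, works in hypothetical units with hν = 1, and truncates the level sum at a finite index. The priors are passed unnormalized, since the constant C cancels between the numerator and Z(β). All names are illustrative.

```python
import math

def mean_energy(energies, priors, beta):
    """Mean energy per oscillator under p_i proportional to p0_i exp(-beta E_i)."""
    weights = [p0 * math.exp(-beta * E) for p0, E in zip(priors, energies)]
    Z = sum(weights)
    return sum(w * E for w, E in zip(weights, energies)) / Z

h_nu, beta = 1.0, 0.8           # hypothetical units with h*nu = 1
levels = range(1, 400)          # truncation of i = 1, 2, ...; the tail is negligible here

# Linear oscillators: E_i = (i - 1/2) h nu with equal priors.
linear = mean_energy([(i - 0.5) * h_nu for i in levels], [1.0 for _ in levels], beta)
linear_closed = h_nu / 2 + h_nu / (math.exp(beta * h_nu) - 1)

# Two-dimensional oscillators: E_i = i h nu with priors proportional to i.
planar = mean_energy([i * h_nu for i in levels], [float(i) for i in levels], beta)
planar_closed = h_nu + 2 * h_nu / (math.exp(beta * h_nu) - 1)

print(linear, linear_closed)
print(planar, planar_closed)
```

In both cases the direct sum reproduces the closed form. Apart from the shifted zero of energy, the linearly increasing priors double the thermal term.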



CONCLUSION: 

In the present paper we have attempted to generalize both the Boltzmann
entropy and the Boltzmann distribution in the light of the Boltzmann-Planck
most-probable interpretation of statistical equilibrium. We have obtained
some new results different from the old ones and have tried to find their
physical significance. Let us state some of the main results along with their
merits.

(i) The present method of determining the probability distribution of
macrostates is more direct and transparent than the old method of inverting
the Boltzmann principle.

(ii) The Boltzmann-Shannon cross-entropy, obtained as a generalization of the
Boltzmann-Shannon entropy, is defined over the set of non-negative integers
$[N_1, N_2, \dots, N_n]$. The relation (18), however, shows that it has a
probabilistic meaning and is consistent with the probabilistic foundation of
entropy and information.

(iii) The most-probable interpretation of statistical equilibrium leading to
the generalized Boltzmann distribution (24) involves the prior probabilities
$p_i^0$. The appearance of prior probabilities makes the results different
from the existing ones, sometimes in violation of the existing physical laws,
for example, the maximum entropy or disorder for statistical equilibrium
[15].

(iv) The importance of prior probability in a physical system, namely a
collection of two-dimensional harmonic oscillators, has been investigated.
Prior probabilities also play a significant role in the statistical
mechanical modelling of ecosystems [17].

The Boltzmann-Shannon entropy is a classical one; its generalization,
however, leads to some new results of physical significance. Finally, the
mathematical simplicity of the paper, which is independent of any mechanical
or statistical models and postulates [18], is an advantage of the theory.